Poisson process approximation:
From Palm theory to Stein's method
National University of Singapore and University of Melbourne

Abstract: This exposition explains the basic ideas of Stein's method for Poisson
random variable approximation and Poisson process approximation from the point
of view of the immigration-death process and Palm theory. The latter approach
also enables us to define local dependence of point processes [Chen and Xia (2004)]
and use it to study Poisson process approximation for locally dependent point
processes and for dependent superposition of point processes.



1. Poisson approximation 

Stein's method for Poisson approximation was developed by Chen [13]. It is based
on the following observation: a nonnegative integer-valued random variable $W$
follows the Poisson distribution with mean $\lambda$, denoted by $\mathrm{Po}(\lambda)$, if and only if

$$\mathbb{E}\{\lambda f(W+1) - W f(W)\} = 0$$

for all bounded $f: \mathbb{Z}_+ \to \mathbb{R}$, where $\mathbb{Z}_+ := \{0, 1, 2, \dots\}$. Heuristically, if
$\mathbb{E}\{\lambda f(W+1) - W f(W)\} \approx 0$ for all bounded $f: \mathbb{Z}_+ \to \mathbb{R}$, then $\mathcal{L}(W) \approx \mathrm{Po}(\lambda)$.
On the other hand, as our interest is often in the difference
$\mathbb{P}(W \in A) - \mathrm{Po}(\lambda)(A) = \mathbb{E}[\mathbf{1}_A(W) - \mathrm{Po}(\lambda)(A)]$, where $A \subset \mathbb{Z}_+$ and $\mathbf{1}_A$ is the
indicator function of $A$, it is natural to relate the function $\lambda f(w+1) - w f(w)$
with $\mathbf{1}_A(w) - \mathrm{Po}(\lambda)(A)$, leading to the Stein equation:

$$\lambda f(w+1) - w f(w) = \mathbf{1}_A(w) - \mathrm{Po}(\lambda)(A). \tag{1}$$
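As a numerical illustration (a minimal Python sketch, not part of the original exposition), a solution of (1) can be computed from the standard closed form $f_A(w+1) = \{\lambda\,\mathrm{Po}(\lambda)\{w\}\}^{-1} \sum_{k \le w} [\mathbf{1}_A(k) - \mathrm{Po}(\lambda)(A)]\,\mathrm{Po}(\lambda)\{k\}$, with $f_A(0)$ arbitrary because it does not enter (1). The function names, the choice $\lambda = 2$, $A = \{0, 3\}$ and the cutoff $w \le 12$ are assumptions made only for this sketch; the cutoff is kept small because the partial sums lose precision far in the Poisson tail.

```python
import math

def poisson_pmf(k, lam):
    """Po(lam){k}."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def stein_solution(A, lam, w_max):
    """Return ([f_A(0), ..., f_A(w_max)], Po(lam)(A)) for a finite set A.

    f_A(0) is set to 0; it is multiplied by w = 0 in (1) and never matters.
    """
    A = set(A)
    po_A = sum(poisson_pmf(k, lam) for k in A)
    f = [0.0]
    partial = 0.0  # running sum of [1_A(k) - Po(lam)(A)] * Po(lam){k} over k <= w
    for w in range(w_max):
        p_w = poisson_pmf(w, lam)
        partial += ((1.0 if w in A else 0.0) - po_A) * p_w
        f.append(partial / (lam * p_w))
    return f, po_A

lam, A, w_max = 2.0, {0, 3}, 12
f, po_A = stein_solution(A, lam, w_max)

# Residual of the Stein equation (1) at w = 0, ..., w_max - 1.
max_residual = max(
    abs(lam * f[w + 1] - w * f[w] - ((1.0 if w in A else 0.0) - po_A))
    for w in range(w_max)
)
# Largest increment |f_A(w+1) - f_A(w)| over 1 <= w < w_max.
max_increment = max(abs(f[w + 1] - f[w]) for w in range(1, w_max))
print(max_residual, max_increment, (1.0 - math.exp(-lam)) / lam)
```

The residual should be at rounding-error level, and the largest increment can be compared with $(1 - e^{-\lambda})/\lambda$, the uniform bound on the increments of $f_A$ used later in this section.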

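If the equation permits a bounded solution $f_A$, then

$$\mathbb{P}(W \in A) - \mathrm{Po}(\lambda)(A) = \mathbb{E}\{\lambda f_A(W+1) - W f_A(W)\};$$

and

$$d_{TV}(\mathcal{L}(W), \mathrm{Po}(\lambda)) := \sup_{A \subset \mathbb{Z}_+} |\mathbb{P}(W \in A) - \mathrm{Po}(\lambda)(A)| = \sup_{A \subset \mathbb{Z}_+} |\mathbb{E}\{\lambda f_A(W+1) - W f_A(W)\}|.$$

As a special case in applications, we consider independent Bernoulli random
variables $X_1, \dots, X_n$ with $\mathbb{P}(X_i = 1) = 1 - \mathbb{P}(X_i = 0) = p_i$, $1 \le i \le n$, and
$W = \sum_{i=1}^n X_i$, $\lambda = \mathbb{E}(W) = \sum_{i=1}^n p_i$. Since

$$\mathbb{E}[W f(W)] = \sum_{i=1}^n \mathbb{E}[X_i f(W)] = \sum_{i=1}^n p_i\,\mathbb{E} f(W_i + 1),$$

where $W_i = W - X_i$, we have

$$\mathbb{E}\{\lambda f_A(W+1) - W f_A(W)\} = \sum_{i=1}^n p_i\,\mathbb{E}[f_A(W+1) - f_A(W_i+1)] = \sum_{i=1}^n p_i^2\,\mathbb{E}\,\Delta f_A(W_i+1),$$

where $\Delta f_A(i) = f_A(i+1) - f_A(i)$. Further analysis shows that $|\Delta f_A(w)| \le (1 - e^{-\lambda})/\lambda$
(see [1] for an analytical proof and [?] for a probabilistic proof). Therefore

$$d_{TV}(\mathcal{L}(W), \mathrm{Po}(\lambda)) \le \frac{1 - e^{-\lambda}}{\lambda} \sum_{i=1}^n p_i^2.$$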

Barbour and Hall [?] proved that the lower bound of $d_{TV}(\mathcal{L}(W), \mathrm{Po}(\lambda))$ above is of
the same order as the upper bound. Thus this simple example of Poisson
approximation demonstrates how powerful and effective Stein's method is. Furthermore,
it is straightforward to use Stein's method to study the quality of Poisson
approximation to the sum of dependent random variables, which has many applications
(see [?] or [?] for more information).
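To see how sharp the Stein bound is in the Bernoulli example, the following sketch (Python, illustrative only) computes the exact total variation distance between $\mathcal{L}(W)$ and $\mathrm{Po}(\lambda)$ by convolution and compares it with $\lambda^{-1}(1 - e^{-\lambda}) \sum_i p_i^2$, the bound that results from $|\Delta f_A| \le (1 - e^{-\lambda})/\lambda$. The probabilities `ps` and all function names are hypothetical choices for this illustration.

```python
import math

def poisson_binomial_pmf(ps):
    """Exact pmf of W = X_1 + ... + X_n for independent Bernoulli(p_i)."""
    pmf = [1.0]
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)
            nxt[k + 1] += mass * p
        pmf = nxt
    return pmf

def tv_to_poisson(ps):
    """d_TV(L(W), Po(lam)) = (1/2) * sum_k |P(W = k) - Po(lam){k}|."""
    lam = sum(ps)
    pmf = poisson_binomial_pmf(ps)
    total, po_cum = 0.0, 0.0
    for k, mass in enumerate(pmf):
        po_k = math.exp(-lam) * lam ** k / math.factorial(k)
        po_cum += po_k
        total += abs(mass - po_k)
    # W never exceeds n, so all Poisson mass beyond n counts as discrepancy.
    total += max(0.0, 1.0 - po_cum)
    return 0.5 * total

def stein_bound(ps):
    """(1 - exp(-lam)) / lam * sum_i p_i^2."""
    lam = sum(ps)
    return (1.0 - math.exp(-lam)) / lam * sum(p * p for p in ps)

ps = [0.05, 0.1, 0.02, 0.08, 0.03]
d_tv = tv_to_poisson(ps)
bound = stein_bound(ps)
print(d_tv, bound)
```

For small $p_i$ the two printed numbers are expected to be of the same order, in line with the matching lower bound mentioned above.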



2. Poisson process approximation 

The Poisson process plays a central role in modeling data on the occurrence of rare
events at random positions in time or space, and is a building block for many
other models such as Cox processes, marked Poisson processes (see [?]), compound
Poisson processes and Lévy processes. To adapt the above idea of Poisson random
variable approximation to Poisson process approximation, we need a probabilistic
interpretation of Stein's method, which was introduced by Barbour [?]. The idea is
to split $f$ by defining $f(w) = g(w) - g(w-1)$ and rewrite the Stein equation (1) as

$$\mathcal{A} g(w) := \lambda[g(w+1) - g(w)] + w[g(w-1) - g(w)] = \mathbf{1}_A(w) - \mathrm{Po}(\lambda)(A), \tag{2}$$

where $\mathcal{A}$ is the generator of an immigration-death process $Z_w(t)$ with immigration
rate $\lambda$, unit per capita death rate, $Z_w(0) = w$, and stationary distribution $\mathrm{Po}(\lambda)$.
The solution to the Stein equation (2) is

$$g_A(w) = -\int_0^\infty \mathbb{E}[\mathbf{1}_A(Z_w(t)) - \mathrm{Po}(\lambda)(A)]\,dt. \tag{3}$$
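Both the substitution $f(w) = g(w) - g(w-1)$ and the stationarity of $\mathrm{Po}(\lambda)$ for the generator can be checked numerically. The following Python sketch uses an arbitrary bounded test function `g`, the value $\lambda = 1.5$ and a truncation of the Poisson sum at 60 terms; all three are assumptions made for illustration.

```python
import math

def generator(g, w, lam):
    """(A g)(w) = lam * [g(w+1) - g(w)] + w * [g(w-1) - g(w)], as in (2)."""
    death = w * (g(w - 1) - g(w)) if w > 0 else 0.0
    return lam * (g(w + 1) - g(w)) + death

def first_order(f, w, lam):
    """Left-hand side of (1): lam * f(w+1) - w * f(w)."""
    return lam * f(w + 1) - w * f(w)

lam = 1.5

def g(w):
    # Arbitrary bounded test function (hypothetical choice).
    return math.sin(w) / (1.0 + w * w)

def f(w):
    # The splitting f(w) = g(w) - g(w-1).
    return g(w) - g(w - 1)

# With this splitting, (1) and (2) have the same left-hand side at every w.
gap = max(abs(generator(g, w, lam) - first_order(f, w, lam)) for w in range(50))

# Po(lam) is stationary for the immigration-death process:
# E[(A g)(Z)] = 0 when Z ~ Po(lam). The mass beyond 60 is negligible here.
mean_Ag = sum(
    math.exp(-lam) * lam ** k / math.factorial(k) * generator(g, k, lam)
    for k in range(60)
)
print(gap, mean_Ag)
```

Both printed values should vanish up to rounding error.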

This probabilistic approach to Stein's method has made it possible to extend Stein's
method to higher dimensions and process settings. To this end, let $\Gamma$ be a compact
metric space which is the carrier space of the point processes being approximated.
Suppose $d_0$ is a metric on $\Gamma$ which is bounded by 1 and $\rho_0$ is a pseudo-metric on $\Gamma$
which is also bounded by 1 but generates a weaker topology. We use $\delta_x$ to denote
the point mass at $x$, and let $\mathcal{X} := \{\sum_{i=1}^k \delta_{\alpha_i}: \alpha_1, \dots, \alpha_k \in \Gamma, k \ge 1\}$ and $\mathcal{B}(\mathcal{X})$ be the
Borel $\sigma$-algebra generated by the weak topology ([23], pp. 168-170): a sequence
$\{\xi_n\} \subset \mathcal{X}$ converges weakly to $\xi \in \mathcal{X}$ if $\int_\Gamma f(x)\,\xi_n(dx) \to \int_\Gamma f(x)\,\xi(dx)$ as $n \to \infty$
for all bounded continuous functions $f$ on $\Gamma$. Such a topology can also be generated
by the metric $d_1$ defined below (see [27], Proposition 4.2). A point process on $\Gamma$ is
defined as a measurable mapping from a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ to $(\mathcal{X}, \mathcal{B}(\mathcal{X}))$
(see [?], p. 13). We use $\Xi$ to stand for a point process on $\Gamma$ with finite intensity
measure $\boldsymbol{\lambda}$ which has total mass $\lambda := \boldsymbol{\lambda}(\Gamma)$, where $\boldsymbol{\lambda}(A) = \mathbb{E}\,\Xi(A)$ for all Borel sets
$A \subset \Gamma$. Let $\mathrm{Po}(\boldsymbol{\lambda})$ denote the distribution of a Poisson process on $\Gamma$ with intensity
measure $\boldsymbol{\lambda}$.

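Since a point process on $\Gamma$ is an $\mathcal{X}$-valued random element, the key step in
extending Stein's method from one-dimensional Poisson approximation to higher
dimensions and process settings is to replace the $\mathbb{Z}_+$-valued immigration-death
process by an immigration-death process defined on $\mathcal{X}$. More precisely, by
adapting (2), Barbour and Brown define the Stein equation as

$$\mathcal{A} g(\xi) := \int_\Gamma [g(\xi + \delta_x) - g(\xi)]\,\boldsymbol{\lambda}(dx) + \int_\Gamma [g(\xi - \delta_x) - g(\xi)]\,\xi(dx) = h(\xi) - \mathrm{Po}(\boldsymbol{\lambda})(h), \tag{4}$$

where $\mathrm{Po}(\boldsymbol{\lambda})(h) = \mathbb{E}\,h(\zeta)$ with $\zeta \sim \mathrm{Po}(\boldsymbol{\lambda})$. The operator $\mathcal{A}$ is the generator of an
$\mathcal{X}$-valued immigration-death process $Z_\xi(t)$ with immigration intensity $\boldsymbol{\lambda}$, unit per
capita death rate, $Z_\xi(0) = \xi \in \mathcal{X}$, and stationary distribution $\mathrm{Po}(\boldsymbol{\lambda})$. Its solution is

$$g_h(\xi) = -\int_0^\infty [\mathbb{E}\,h(Z_\xi(t)) - \mathrm{Po}(\boldsymbol{\lambda})(h)]\,dt \tag{5}$$

(see [?]).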

To measure the error of approximation, we use a Wasserstein pseudo-metric, which
has the advantage of allowing us to lift the carrier space to a bigger carrier space.
Of course, other metrics such as the total variation distance can also be considered;
the only difference is a change in the set of test functions $h$. Let

$$\rho_1\Big(\sum_{i=1}^m \delta_{x_i}, \sum_{j=1}^n \delta_{y_j}\Big) := \begin{cases} 1 & \text{if } m \ne n, \\ \min_\pi \frac{1}{m} \sum_{i=1}^m \rho_0(x_i, y_{\pi(i)}) & \text{if } m = n \ge 1, \\ 0 & \text{if } n = m = 0, \end{cases}$$

where the minimum is taken over all permutations $\pi$ of $\{1, 2, \dots, m\}$. Clearly, $\rho_1$ is
a metric (resp. pseudo-metric) on $\mathcal{X}$ if $\rho_0$ is a metric (resp. pseudo-metric) on $\Gamma$. Set

$$\mathcal{H} = \{h \text{ on } \mathcal{X}: |h(\xi_1) - h(\xi_2)| \le \rho_1(\xi_1, \xi_2) \text{ for all } \xi_1, \xi_2 \in \mathcal{X}\}.$$
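For small configurations, $\rho_1$ can be evaluated by brute force over all permutations. The sketch below (Python; the function name `rho1`, the choice $\rho_0(x, y) = \min\{|x - y|, 1\}$ on the real line and the example configurations are assumptions made for illustration) follows the three cases of the definition literally.

```python
from itertools import permutations

def rho1(xs, ys, rho0):
    """rho_1 between the configurations sum_i delta_{xs[i]} and sum_j delta_{ys[j]}."""
    m, n = len(xs), len(ys)
    if m != n:
        return 1.0  # different total masses
    if m == 0:
        return 0.0  # both configurations are empty
    # Average matching cost, minimised over all permutations of the indices.
    return min(
        sum(rho0(x, ys[j]) for x, j in zip(xs, perm)) / m
        for perm in permutations(range(m))
    )

def rho0(x, y):
    # A metric on the carrier space bounded by 1 (hypothetical choice).
    return min(abs(x - y), 1.0)

a = rho1([0.1, 0.5, 0.9], [0.85, 0.15, 0.5], rho0)  # best matching costs 0.1 / 3
b = rho1([0.1, 0.5], [0.1, 0.5, 0.9], rho0)         # unequal sizes
c = rho1([], [], rho0)                              # two empty configurations
print(a, b, c)
```

The factorial cost of the search is acceptable only for a handful of points; an assignment algorithm would be the natural replacement for larger configurations.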



For point processes $\Xi_1$ and $\Xi_2$, define

$$\rho_2(\mathcal{L}(\Xi_1), \mathcal{L}(\Xi_2)) := \sup_{h \in \mathcal{H}} |\mathbb{E}\,h(\Xi_1) - \mathbb{E}\,h(\Xi_2)|;$$

then $\rho_2$ is a metric (resp. pseudo-metric) on the distributions of point processes if
$\rho_1$ is a metric (resp. pseudo-metric). In summary, we have defined a Wasserstein
pseudo-metric on the distributions of point processes on $\Gamma$ through a pseudo-metric
on $\Gamma$, as shown in the following chart:

Carrier space $\Gamma$        Configuration space $\mathcal{X}$        Space of the distributions
                                                             of point processes
$\rho_0$               -->    $\rho_1$                     -->    $\rho_2$
($\le 1$)                     ($\le 1$)                           ($\le 1$)

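As a simple example, we consider a Bernoulli process defined as

$$\Xi = \sum_{i=1}^n X_i \delta_{i/n},$$

where, as before, $X_1, \dots, X_n$ are independent Bernoulli random variables with
$\mathbb{P}(X_i = 1) = 1 - \mathbb{P}(X_i = 0) = p_i$, $1 \le i \le n$. Then $\Xi$ is a point process on the
carrier space $\Gamma = [0,1]$ with intensity measure $\boldsymbol{\lambda} = \sum_{i=1}^n p_i \delta_{i/n}$. With the metric
$\rho_0(x,y) = |x-y| =: d_0(x,y)$, we denote the induced metric $\rho_2$ by $d_2$. Using the
Stein equation (4), we have

$$\begin{aligned} \mathbb{E}\,h(\Xi) - \mathrm{Po}(\boldsymbol{\lambda})(h) &= \mathbb{E} \int_\Gamma [g_h(\Xi + \delta_x) - g_h(\Xi)]\,\boldsymbol{\lambda}(dx) + \mathbb{E} \int_\Gamma [g_h(\Xi - \delta_x) - g_h(\Xi)]\,\Xi(dx) \\ &= \sum_{i=1}^n p_i^2\,\mathbb{E}[g_h(\Xi_i + 2\delta_{i/n}) - 2 g_h(\Xi_i + \delta_{i/n}) + g_h(\Xi_i)], \end{aligned}$$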

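where $\Xi_i = \Xi - X_i \delta_{i/n}$. It was shown in [27], Proposition 5.21, that

$$\sup_{h \in \mathcal{H},\, \alpha, \beta \in \Gamma} |g_h(\xi + \delta_\alpha + \delta_\beta) - g_h(\xi + \delta_\alpha) - g_h(\xi + \delta_\beta) + g_h(\xi)| \le \frac{3.5}{\lambda} + \frac{2.5}{|\xi| + 1}, \tag{6}$$

where, here and in the sequel, $|\xi|$ is the total mass of $\xi$ and $\lambda = \boldsymbol{\lambda}(\Gamma) = \sum_{i=1}^n p_i$. Hence

$$\begin{aligned} d_2(\mathcal{L}(\Xi), \mathrm{Po}(\boldsymbol{\lambda})) &= \sup_{h \in \mathcal{H}} |\mathbb{E}\,h(\Xi) - \mathrm{Po}(\boldsymbol{\lambda})(h)| \\ &\le \sum_{i=1}^n p_i^2\,\mathbb{E}\Big(\frac{3.5}{\lambda} + \frac{2.5}{|\Xi_i| + 1}\Big) \le \Big(\frac{3.5}{\lambda} + \frac{2.5}{\lambda - \max_{1 \le i \le n} p_i}\Big) \sum_{i=1}^n p_i^2, \end{aligned} \tag{7}$$

since

$$\mathbb{E}\,\frac{1}{|\Xi_i| + 1} = \mathbb{E} \int_0^1 z^{|\Xi_i|}\,dz = \int_0^1 \prod_{j \ne i} (1 - p_j + p_j z)\,dz \le \int_0^1 \exp\Big\{-\sum_{j \ne i} p_j (1 - z)\Big\}\,dz \le \frac{1}{\lambda - \max_{1 \le i \le n} p_i}$$

(see [23], pp. 167-168). Since $d_2(\mathcal{L}(\Xi), \mathrm{Po}(\boldsymbol{\lambda})) \ge d_{TV}(\mathcal{L}(|\Xi|), \mathrm{Po}(\lambda))$ and the lower
bound of $d_{TV}(\mathcal{L}(|\Xi|), \mathrm{Po}(\lambda))$ is of the same order as $\frac{1}{\lambda} \sum_{i=1}^n p_i^2$ [?], the bound in (7)
is of the optimal order.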
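The last step of (7) rests on the estimate of $\mathbb{E}[1/(|\Xi_i| + 1)]$. The following sketch (Python, illustrative only) evaluates this expectation exactly from the distribution of $|\Xi_i| = \sum_{j \ne i} X_j$ and compares the two upper bounds in (7). The default constants `c1 = 3.5` and `c2 = 2.5` are assumed to be the numerators of the Stein factor in (6); they, the probabilities `ps` and the function names are choices made for this illustration.

```python
def count_pmf(ps):
    """Exact pmf of a sum of independent Bernoulli(p) variables, p in ps."""
    pmf = [1.0]
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)
            nxt[k + 1] += mass * p
        pmf = nxt
    return pmf

def inv_moment(ps, i):
    """E[1 / (|Xi_i| + 1)], where |Xi_i| is the sum of X_j over j != i."""
    rest = ps[:i] + ps[i + 1:]
    return sum(mass / (k + 1) for k, mass in enumerate(count_pmf(rest)))

def bound7(ps, c1=3.5, c2=2.5):
    """Return the middle and the right-hand expressions of (7)."""
    lam = sum(ps)
    middle = sum(
        p * p * (c1 / lam + c2 * inv_moment(ps, i)) for i, p in enumerate(ps)
    )
    closed_form = (c1 / lam + c2 / (lam - max(ps))) * sum(p * p for p in ps)
    return middle, closed_form

ps = [0.3, 0.2, 0.4, 0.25, 0.35, 0.3]
lam = sum(ps)
middle, closed_form = bound7(ps)
print(middle, closed_form)
```

For these values the middle expression is the smaller of the two, as the inequality requires; the closed form trades some sharpness for a bound that depends on the $p_i$ only through $\lambda$, $\max_i p_i$ and $\sum_i p_i^2$.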
3. From Palm theory to Stein's method



Barbour's probabilistic approach to Stein's method is based on the conversion of a
first order difference equation to a second order difference equation. In this section,
we take another approach to Stein's method from the point of view of Palm theory.
The connection between Stein's method and Palm theory has been known to many
others (e.g., T. C. Brown (personal communication), [9]) and the exposition here
is mainly based on [?] and [22].

There are two properties which distinguish a Poisson process from other processes:
it has independent increments, and the number of points on any bounded set follows
a Poisson distribution. Hence, a Poisson process can be thought of as a process pieced
together from many independent "Poisson components" (if the location is an atom,
the "component" will be a Poisson random variable, but if the location is diffuse,
then the "component" is either 0 or 1) ([?], p. 121). Consequently, to specify a
Poisson process $N$, it is sufficient to check that "each component" $N(d\alpha)$ is Poisson
and independent of the others, that is, $\mathbb{E}\{[\mathbb{E} N(d\alpha)]\,g(N + \delta_\alpha) - N(d\alpha)\,g(N)\} = 0$,
which is equivalent to

$$\frac{\mathbb{E}[g(N)\,N(d\alpha)]}{\mathbb{E}[N(d\alpha)]} = \mathbb{E}\,g(N + \delta_\alpha) \tag{8}$$

for all bounded functions $g$ on $\mathcal{X}$ and all $\alpha \in \Gamma$ (see [27], p. 121). To make the
heuristic argument rigorous, one needs the tools of Campbell measures and
Radon-Nikodym derivatives ([?], p. 83).

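In general, for each point process $\Xi$ with finite mean measure $\boldsymbol{\lambda}$, we may define
the Campbell measure $C(B, M) = \mathbb{E}[\Xi(B)\,\mathbf{1}_{\Xi \in M}]$ for all Borel $B \subset \Gamma$, $M \in \mathcal{B}(\mathcal{X})$.
This measure is finite and admits the following disintegration:

$$C(B, M) = \int_B Q_s(M)\,\boldsymbol{\lambda}(ds), \tag{9}$$

or equivalently,

$$Q_s(M) = \frac{\mathbb{E}[\Xi(ds)\,\mathbf{1}_{\Xi \in M}]}{\boldsymbol{\lambda}(ds)}, \quad M \in \mathcal{B}(\mathcal{X}),\ s \in \Gamma\ \ \boldsymbol{\lambda} \text{ a.s.},$$

where $\{Q_s, s \in \Gamma\}$ are probability measures on $\mathcal{B}(\mathcal{X})$ ([11], p. 83 and p. 164) and are
called Palm distributions. Moreover, (9) is equivalent to the statement that, for any
measurable function $f: \Gamma \times \mathcal{X} \to \mathbb{R}_+$,

$$\mathbb{E}\Big(\int_B f(\alpha, \Xi)\,\Xi(d\alpha)\Big) = \int_B \int_{\mathcal{X}} f(\alpha, \xi)\,Q_\alpha(d\xi)\,\boldsymbol{\lambda}(d\alpha) \tag{10}$$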

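for all Borel sets $B \subset \Gamma$. A point process $\Xi_\alpha$ (resp. $\Xi_\alpha - \delta_\alpha$) on $\Gamma$ is called a Palm
process (resp. reduced Palm process) of $\Xi$ at location $\alpha$ if it has the Palm distribution
$Q_\alpha$ and, when $\Xi$ is a simple point process (a point process taking values 0 or 1 at
each location), the Palm distribution $\mathcal{L}(\Xi_\alpha)$ can be interpreted as the conditional
distribution of $\Xi$ given that there is a point of $\Xi$ at $\alpha$. It follows from (10) that the
Palm process satisfies

$$\mathbb{E}\Big(\int_\Gamma f(\alpha, \Xi)\,\Xi(d\alpha)\Big) = \mathbb{E}\Big(\int_\Gamma f(\alpha, \Xi_\alpha)\,\boldsymbol{\lambda}(d\alpha)\Big)$$

for all bounded measurable functions $f$ on $\Gamma \times \mathcal{X}$. In particular, $\Xi$ is a Poisson
process if and only if

$$\mathcal{L}(\Xi_\alpha) = \mathcal{L}(\Xi + \delta_\alpha), \quad \boldsymbol{\lambda} \text{ a.s.},$$

where the extra point $\delta_\alpha$ is due to the "Poisson property" of $\Xi\{\alpha\}$, and $\Xi_\alpha|_{\Gamma \setminus \{\alpha\}}$
has the same distribution as $\Xi|_{\Gamma \setminus \{\alpha\}}$ because of independent increments. Here $\xi|_A$
stands for the point measure $\xi$ restricted to $A \subset \Gamma$ ([?], p. 12). In other words,
$\Xi \sim \mathrm{Po}(\boldsymbol{\lambda})$ if and only if

$$\mathbb{E}\Big\{\int_\Gamma f(\alpha, \Xi + \delta_\alpha)\,\boldsymbol{\lambda}(d\alpha) - \int_\Gamma f(\alpha, \Xi)\,\Xi(d\alpha)\Big\} = 0$$

for a sufficiently rich class of functions $f$, so we define

$$\mathcal{D} f(\xi) := \int_\Gamma f(x, \xi + \delta_x)\,\boldsymbol{\lambda}(dx) - \int_\Gamma f(x, \xi)\,\xi(dx).$$

If $\mathbb{E}\,\mathcal{D} f(\Xi) \approx 0$ for an appropriate class of test functions $f$, then $\mathcal{L}(\Xi_\alpha)$ is close to
$\mathcal{L}(\Xi + \delta_\alpha)$, which means that $\mathcal{L}(\Xi)$ is close to $\mathrm{Po}(\boldsymbol{\lambda})$ under the metric or
pseudo-metric specified by the class of test functions $f$.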
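The characterization can be checked directly when the carrier space consists of finitely many atoms, since a Poisson process is then a vector of independent Poisson counts. The sketch below (Python; the two-atom carrier space, the test function `f`, the binomial comparison process and the truncation level `k_max` are all assumptions made for illustration) evaluates $\mathbb{E}\,\mathcal{D} f(\Xi)$ by enumeration, once for Poisson counts and once for binomial counts with the same mean measure.

```python
import math
from itertools import product

def pois(k, mu):
    """Po(mu){k}."""
    return math.exp(-mu) * mu ** k / math.factorial(k)

def binom(k, n, p):
    """Binomial(n, p) probability of k."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k) if 0 <= k <= n else 0.0

def expected_Df(f, lam, pmfs, k_max):
    """E[D f(Xi)] for a process on atoms 0, ..., len(lam)-1 with independent counts.

    A configuration xi is a tuple of counts, lam[a] is the mean measure of
    atom a and pmfs[a](k) = P(Xi{a} = k). Counts above k_max are ignored.
    """
    total = 0.0
    atoms = range(len(lam))
    for xi in product(range(k_max + 1), repeat=len(lam)):
        prob = math.prod(pmfs[a](xi[a]) for a in atoms)
        if prob == 0.0:
            continue
        d = 0.0
        for a in atoms:
            plus = xi[:a] + (xi[a] + 1,) + xi[a + 1:]  # xi + delta_a
            d += lam[a] * f(a, plus) - xi[a] * f(a, xi)
        total += prob * d
    return total

def f(a, xi):
    # Bounded test function of the location and the configuration (hypothetical).
    return 1.0 / (1.0 + a + sum(xi))

lam = (0.7, 1.2)
poisson_pmfs = [lambda k, mu=mu: pois(k, mu) for mu in lam]
binomial_pmfs = [lambda k, mu=mu: binom(k, 2, mu / 2.0) for mu in lam]

poisson_value = expected_Df(f, lam, poisson_pmfs, k_max=40)
binomial_value = expected_Df(f, lam, binomial_pmfs, k_max=40)
print(poisson_value, binomial_value)
```

The Poisson value should vanish up to rounding error, while the binomial value is visibly nonzero: the size of $\mathbb{E}\,\mathcal{D} f(\Xi)$ detects the departure from the Poisson law.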
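If $f_g$ is a solution of

$$\mathcal{D} f(\xi) = g(\xi) - \mathrm{Po}(\boldsymbol{\lambda})(g),$$

then a distance between $\mathcal{L}(\Xi)$ and $\mathrm{Po}(\boldsymbol{\lambda})$ is measured by $|\mathbb{E}\,\mathcal{D} f_g(\Xi)|$ over the class
of functions $g$.

From the above analysis, we can see that there are many possible solutions $f_g$ for a
given function $g$. The one which admits an immigration-death process interpretation
is obtained by setting

$$f(x, \xi) = h(\xi) - h(\xi - \delta_x),$$

so that $\mathcal{D} f$ takes the following form:

$$\mathcal{D} f(\xi) = \int_\Gamma [h(\xi + \delta_x) - h(\xi)]\,\boldsymbol{\lambda}(dx) + \int_\Gamma [h(\xi - \delta_x) - h(\xi)]\,\xi(dx) = \mathcal{A} h(\xi),$$

where $\mathcal{A}$ is the same as the generator defined in Section 2.
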
4. Locally dependent point processes 

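We say a point process $\Xi$ is locally dependent with neighborhoods $\{A_\alpha \subset \Gamma;\ \alpha \in \Gamma\}$
if $\mathcal{L}(\Xi|_{A_\alpha^c}) = \mathcal{L}(\Xi_\alpha|_{A_\alpha^c})$, $\alpha \in \Gamma$ $\boldsymbol{\lambda}$ a.s.

The following theorem is essentially Corollary 3.6 in [?] combined with the
new estimates of Stein's factors in [27], Proposition 5.21.

Theorem 4.1. Let $\Xi$ be a point process on $\Gamma$ with finite intensity measure $\boldsymbol{\lambda}$, which
has total mass $\lambda$, and suppose that $\Xi$ is locally dependent with neighborhoods
$\{A_\alpha \subset \Gamma: \alpha \in \Gamma\}$. Then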



p,mE), Po(A)) < E^^^ (^^ + p^lp^) - mda 



where S^"^ = "^Uq '^"'^ ~ '^/'Uq- 

Remark. The error bound is a "correct" generalization of $\frac{1}{\lambda} \sum_{i=1}^n p_i^2$ with the Stein
factor $\frac{1}{\lambda}$ replaced by a nonuniform bound.

5. Applications

5.1. Matérn hard core process on $\mathbb{R}^d$

A Matérn hard core process $\Xi$ on compact $\Gamma \subset \mathbb{R}^d$ is a model for particles with
repulsive interaction. It assumes that points occur according to a Poisson process
with uniform intensity measure on $\Gamma$. The configurations of $\Xi$ are then obtained
by deleting any point which is within distance $r$ of another point, irrespective of
whether the latter point has itself already been deleted [see Cox & Isham [17],
p. 170].
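The thinning rule can be sketched in a few lines of Python. The carrier space is assumed here to be the unit square with the Euclidean distance, and the names `matern_hard_core`, `poisson_sample`, `mu` and `r` are illustrative rather than taken from [17]; the quadratic-time neighbor search is chosen for brevity, not efficiency.

```python
import math
import random

def poisson_sample(mu, rng):
    """Sample Po(mu) by counting unit-rate exponential arrivals in [0, mu]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= mu:
        count += 1
        t += rng.expovariate(1.0)
    return count

def matern_hard_core(mu, r, rng):
    """Thin a uniform Poisson process with mean mu on the unit square."""
    n = poisson_sample(mu, rng)
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    # A point survives only if no OTHER original point lies within distance r,
    # whether or not that other point is itself deleted.
    return [
        p for i, p in enumerate(pts)
        if all(math.dist(p, q) > r for j, q in enumerate(pts) if j != i)
    ]

rng = random.Random(0)
kept = matern_hard_core(mu=50.0, r=0.05, rng=rng)
print(len(kept))
```

Because deletion is decided against the original configuration, two retained points are never within distance `r` of each other, and whether a point near `alpha` is retained depends only on original points within distance `2 * r` of `alpha`.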

The point process is locally dependent with neighborhoods $\{B(\alpha, 2r): \alpha \in \Gamma\}$,
where $B(\alpha, s)$ is the ball centered at $\alpha$ with radius $s$. Let $\boldsymbol{\lambda}$ be the intensity measure
of $\Xi$ and $d_0(\alpha, \beta) = \min\{|\alpha - \beta|, 1\}$; then

*(r,H,,P„w).o(^M||ip!), 

where $\mu$ is the mean of the total number of points of the original Poisson process
(see [?], Theorem 5.1).



5.2. Palindromes in a genome 

Let $\{I_i: 1 \le i \le n\}$ be locally dependent Bernoulli random variables and
$\{U_i: 1 \le i \le n\}$ be independent $\Gamma$-valued random elements which are also independent
of $\{I_i: 1 \le i \le n\}$. Set $\Xi = \sum_{i=1}^n I_i \delta_{U_i}$; then $\Xi$ is a point process on $\Gamma$. For $U_i = i/n$,
this point process models palindromes in a genome, where $I_i$ represents whether
a palindrome occurs at $i/n$. The point process can also be used to describe the
vertices in a random graph.

In general, the $U_i$'s could take the same value and one cannot tell which $U_i$,
and therefore which $I_i$, contributes to the value. To overcome this difficulty, we
lift the process up to a point process $\Xi' = \sum_{i=1}^n I_i \delta_{(i, U_i)}$ on a larger space
$\Gamma' = \{1, 2, \dots, n\} \times \Gamma$. The metric $d_0$ becomes a pseudo-metric $\rho_0$, that is,
$\rho_0((i,s), (j,t)) = d_0(s,t)$, and $\Xi'$ is a locally dependent process (see [?], section 4).
It turns out that the Poisson process approximation of $\Xi = \sum_{i=1}^n I_i \delta_{U_i}$ is a special
case of the setting of the following section.



5.3. Locally dependent superposition of point processes 

Since the publication of the Grigelionis Theorem [23], which states that the
superposition of independent sparse point processes on carrier space $\mathbb{R}_+$ is close to a
Poisson process, there has been a lot of study on the weak convergence of point
processes to a Poisson process under various conditions (see, e.g., [16, 19, 21] and
[10]). Extensions to dependent superposition¹ of sparse point processes have been
carried out in [1], [?], [3], [?], [?]. Schuhmacher [?] considered the Wasserstein
distance between the weakly dependent superposition of sparse point processes and a
Poisson process.

¹We use "(resp. locally, weakly) dependent superposition of point processes" to mean that the
point processes are (resp. locally, weakly) dependent among themselves.

Let $\Gamma$ be a compact metric space and $\{\Xi_i: i \in \mathcal{I}\}$ be a collection of point processes
on $\Gamma$ with intensity measures $\boldsymbol{\lambda}_i$, $i \in \mathcal{I}$. Define $\Xi = \sum_{i \in \mathcal{I}} \Xi_i$ with intensity measure
$\boldsymbol{\lambda} = \sum_{i \in \mathcal{I}} \boldsymbol{\lambda}_i$. Assume $\{\Xi_i: i \in \mathcal{I}\}$ are locally dependent: that is, for each $i \in \mathcal{I}$,
there exists a neighborhood $A_i \subset \mathcal{I}$ such that $i \in A_i$ and $\Xi_i$ is independent of
$\{\Xi_j: j \notin A_i\}$.

The locally dependent point process $\Xi = \sum_{i} I_i \delta_{U_i}$ can be regarded as a locally
dependent superposition of point processes defined above.

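Theorem 5.1 ([15]). With the above setup and $\lambda = \boldsymbol{\lambda}(\Gamma)$, we have

$$d_2(\mathcal{L}(\Xi), \mathrm{Po}(\boldsymbol{\lambda})) \le \mathbb{E} \sum_{i \in \mathcal{I}} \int_\Gamma \Big(\frac{3.5}{\lambda} + \frac{2.5}{|\Xi^{(i)}| + 1}\Big)\,d_1'(V_i, V_{i,\alpha})\,\boldsymbol{\lambda}_i(d\alpha),$$

where $\Xi^{(i)} = \sum_{j \notin A_i} \Xi_j$, $V_i = \sum_{j \in A_i} \Xi_j$, $\Xi_{i,\alpha}$ is the reduced Palm process of $\Xi_i$
at $\alpha$,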

P(Vi,„ e Af) = for all M £ B{X) 



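and

$$d_1'(\xi_1, \xi_2) = \min_{\pi:\ \text{permutations of } \{1, \dots, m\}} \sum_{i=1}^n d_0(y_i, z_{\pi(i)}) + (m - n)$$

for $\xi_1 = \sum_{i=1}^n \delta_{y_i}$ and $\xi_2 = \sum_{i=1}^m \delta_{z_i}$ with $m \ge n$ [Brown & Xia [?]].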
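A brute-force evaluation of $d_1'$ may help fix ideas. In the Python sketch below, the function name `d1_prime`, the choice $d_0(x, y) = \min\{|x - y|, 1\}$ and the example configurations are assumptions made for illustration. Since only the images $\pi(1), \dots, \pi(n)$ enter the sum, it suffices to search over injective maps from the $n$ points of $\xi_1$ into the $m$ points of $\xi_2$, which is practical only for small configurations.

```python
from itertools import permutations

def d1_prime(ys, zs, d0):
    """Unnormalised matching distance: best matching cost plus the excess m - n."""
    if len(ys) > len(zs):
        ys, zs = zs, ys  # make ys the smaller configuration (d0 is symmetric)
    n, m = len(ys), len(zs)
    # permutations(range(m), n) lists the injective maps {1..n} -> {1..m}.
    best = min(
        sum(d0(y, zs[j]) for y, j in zip(ys, idx))
        for idx in permutations(range(m), n)
    )
    return best + (m - n)

def d0(x, y):
    # A metric on the carrier space bounded by 1 (hypothetical choice).
    return min(abs(x - y), 1.0)

value = d1_prime([0.2, 0.7], [0.75, 0.1, 0.2], d0)  # 0.05 matching cost + 1 unmatched
empty_vs_one = d1_prime([], [0.3], d0)              # nothing to match, one excess point
print(value, empty_vs_one)
```

Unlike $\rho_1$, this distance is not averaged over the points and charges one unit for every unmatched point, so it is not bounded by 1.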

Corollary 5.2 ([?]). For $\Xi = \sum_{i \in \mathcal{I}} I_i \delta_{U_i}$ and $\lambda = \sum_{i \in \mathcal{I}} p_i$ defined in Section 5.2,

d,(£(S),Po(A))<E^ Y: (^4 + VTi)'''^ 



2.5 



V', + 1 



where $V_i = \sum_{j \notin A_i} I_j$.

Corollary 5.3 ([15]). Suppose that $\{\Xi_i: 1 \le i \le n\}$ are independent renewal
processes on $[0, T]$ with the first arrival time of $\Xi_i$ having distribution $G_i$ and its
inter-arrival time having distribution $F_i$, and let $\Xi = \sum_{i \in \mathcal{I}} \Xi_i$ and $\boldsymbol{\lambda}$ be its intensity
measure, then